Skip to content

perf: use CoW (copy-on-write) cloning for directory copies#122

Merged
helizaga merged 6 commits intocoderabbitai:mainfrom
arwtyxouymz:feat/faster-copy
Feb 17, 2026
Merged

perf: use CoW (copy-on-write) cloning for directory copies#122
helizaga merged 6 commits intocoderabbitai:mainfrom
arwtyxouymz:feat/faster-copy

Conversation

@arwtyxouymz
Copy link
Contributor

@arwtyxouymz arwtyxouymz commented Feb 13, 2026

Summary

  • Add _fast_copy_dir() to lib/copy.sh that uses CoW file cloning when supported by the filesystem
    • macOS APFS: cp -cRP (clone) with fallback to cp -RP
    • Linux Btrfs/XFS: cp --reflink=auto -RP (auto-fallback built in)
    • Other filesystems: standard cp -RP (no behavior change)
  • Replace cp -RP in copy_directories() with _fast_copy_dir
  • Add 3 BATS tests for _fast_copy_dir (contents, symlinks, error handling)

Motivation

Copying large directories like node_modules or .venv via gtr.copy.includeDirs can be slow. On CoW-capable filesystems (APFS, Btrfs, XFS), file cloning is near-instant regardless of directory size, as only metadata is copied.

Risk

  • Zero regression on non-CoW filesystems (ext4, NTFS, etc.) — all paths fall back to the current cp -RP behavior
  • Individual file copies (copy_patterns) are intentionally unchanged, as CoW overhead outweighs benefit for small config files

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores

    • Optimized directory copying to prefer copy-on-write where available, with OS-specific fallbacks, cleanup for partial copies, symlink preservation, unified logging, and preserved default behavior on systems without CoW.
  • CI

    • Added a workflow job to verify generated completion files are up to date.
  • Tests

    • Added safety tests for copy behavior, symlink preservation, and handling of missing sources.

@coderabbitai
Copy link

coderabbitai bot commented Feb 13, 2026

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. By default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)

Walkthrough

Adds _fast_copy_dir() and an _fast_copy_os cache to lib/copy.sh for OS-aware Copy-on-Write-capable directory copies, updates copy_directories() to use it and reuse computed find results, adds tests for _fast_copy_dir(), and a CI job to verify generated completions.

Changes

Cohort / File(s) Summary
Copy-on-Write Implementation
lib/copy.sh
Adds _fast_copy_dir src dest and _fast_copy_os cache; implements OS-specific CoW attempts (Linux: cp --reflink=auto -RP; macOS: cp -cRP then cleanup+fallback to cp -RP; others: cp -RP), preserves symlinks, replaces inline cp -RP in copy_directories(), and computes find results once per pattern.
Tests
tests/copy_safety.bats
Initializes _fast_copy_os in setup(), adds teardown() cleanup, and adds tests for _fast_copy_dir() covering content/structure copy, symlink preservation (via readlink), and failure when source is missing.
CI / Lint
.github/workflows/lint.yml
Adds a new "Completions" GitHub Actions job that checks out the repo and runs ./scripts/generate-completions.sh --check to verify completion files are up to date.

Sequence Diagram(s)

sequenceDiagram
  participant Caller as Script (copy_directories)
  participant Helper as _fast_copy_dir()
  participant OS as OS cp
  participant FS as Filesystem

  Caller->>Helper: copy(src, dst)
  Helper->>Helper: detect/read cached OS (_fast_copy_os)
  alt Linux
    Helper->>OS: cp --reflink=auto -RP src dst
  else macOS
    Helper->>OS: cp -cRP src dst
    alt partial CoW created / failure
      Helper->>FS: remove partial dst
      Helper->>OS: cp -RP src dst
    end
  else Other
    Helper->>OS: cp -RP src dst
  end
  OS->>FS: perform copy (reflink or full)
  Helper->>Caller: return status and log result
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through folders, gentle and spry,

Clones that whisper, lighter than a sigh.
Symlinks kept snug, structure held tight,
A clever copy, done just right. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding CoW cloning support for directory copies. It is specific, concise, and clearly identifies the primary performance improvement.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@lib/copy.sh`:
- Around line 88-93: The darwin branch can leave a partially-populated
destination when cp -cRP "$src" "$dest" fails, so modify the fallback to clean
the destination before retrying: after the failed cp -cRP "$src" "$dest"
attempt, detect and remove the partially-created destination (e.g., rm -rf
"$dest" only when it was created by the failed attempt) or perform the initial
clone into a temporary directory and then atomically rename/move it into
"$dest", then run cp -RP "$src" "$dest" as the fallback; update the darwin case
where cp -cRP and cp -RP are invoked to implement this cleanup/temporal-staging
logic.
🧹 Nitpick comments (2)
lib/copy.sh (1)

84-85: detect_os is invoked on every call; consider caching for loop-heavy callers.

copy_directories calls _fast_copy_dir once per matched directory. Each invocation spawns a subshell for detect_os. You could cache the result in a module-level variable on first call.

Proposed optimization
+# Cached OS value for _fast_copy_dir; set on first call.
+_fast_copy_os=""
+
 _fast_copy_dir() {
   local src="$1" dest="$2"
-  local os
-  os=$(detect_os)
+  if [ -z "$_fast_copy_os" ]; then
+    _fast_copy_os=$(detect_os)
+  fi
+  local os="$_fast_copy_os"
tests/copy_safety.bats (1)

89-101: Temp directory cleanup won't run if an assertion fails.

If [ -f "$dst/mydir/sub/file.txt" ] fails on line 98, lines 100 (rm -rf) is never reached. Same pattern in the other two tests. Consider using a BATS teardown function or the $BATS_TEST_TMPDIR built-in (BATS 1.4+) for automatic cleanup.

Example using teardown
+teardown() {
+  # Clean up any temp dirs created during tests
+  [ -n "${_test_tmpdir:-}" ] && rm -rf "$_test_tmpdir"
+}
+
 `@test` "_fast_copy_dir copies directory contents" {
-  local src dst
-  src=$(mktemp -d)
-  dst=$(mktemp -d)
+  _test_tmpdir=$(mktemp -d)
+  local src="$_test_tmpdir/src" dst="$_test_tmpdir/dst"
+  mkdir -p "$src" "$dst"
   mkdir -p "$src/mydir/sub"
   echo "hello" > "$src/mydir/sub/file.txt"
 
   _fast_copy_dir "$src/mydir" "$dst/"
 
   [ -f "$dst/mydir/sub/file.txt" ]
   [ "$(cat "$dst/mydir/sub/file.txt")" = "hello" ]
-  rm -rf "$src" "$dst"
 }

Copy link
Collaborator

@helizaga helizaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice contribution! The CoW optimization is well-scoped, correctly handles fallbacks, and addresses a real performance need for includeDirs users copying large directories like node_modules.

Review notes:

  • All 239 tests pass (3 new + 236 existing)
  • Manually verified: directory copy works, symlinks preserved, CoW cloning active on macOS APFS
  • set -e safe at the call site (guarded by if)
  • Fallback chain is correct on all platforms

Fix pushed: I added a commit (a0ce2f5) to fix a pre-existing Bash 3.2 bug discovered during testing — copy_directories() had a case inside $() inside a heredoc, which Bash 3.2 (macOS system shell) can't parse. This caused copy_directories() to silently fail on every invocation. The fix hoists the find command into a variable before the heredoc. This bug existed on main and was not introduced by this PR, but fixing it here means the CoW optimization actually works end-to-end on macOS.

arwtyxouymz and others added 4 commits February 17, 2026 11:02
Add _fast_copy_dir() that leverages filesystem-level cloning
(macOS APFS cp -c, Linux cp --reflink=auto) for near-instant
directory copies on supported filesystems, with automatic
fallback to standard cp -RP on ext4/NTFS/others.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Clean up partial clone output before fallback in Darwin _fast_copy_dir
- Cache detect_os result to avoid repeated subshell calls
- Use teardown function in copy_safety tests for reliable tmpdir cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bash 3.2 cannot parse case/esac inside $() inside heredocs,
causing copy_directories() to silently fail on macOS system Bash.
Move the find command into a variable before the heredoc.
- Introduced a new job in the lint workflow to check if completion files are up to date.
- Updated the generate-completions.sh script to include a new option for color output mode.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/copy.sh (1)

78-90: Consider resetting the cache in tests or documenting the stale-cache risk.

_fast_copy_os is a module-level global that persists for the lifetime of the process. If detect_os could ever return different results (e.g., during testing with mocked OSTYPE), the cached value will be stale. This is unlikely in production but could cause confusing test failures.

A simple _fast_copy_os="" in test setup() would make tests deterministic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/copy.sh` around lines 78 - 90, The module-level cache variable
_fast_copy_os used by _fast_copy_dir can become stale during tests that mock
detect_os; add a small cache-reset mechanism and documentation so tests remain
deterministic: create a tiny helper function (e.g., _fast_copy_reset_cache) that
sets _fast_copy_os="" and call that in test setup (or alternatively move the
cache into a function-scoped static or document the stale-cache risk in the
module comment), and reference _fast_copy_os, _fast_copy_dir and detect_os when
updating tests/docs so maintainers know how to clear the cache.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/copy.sh`:
- Around line 98-101: The short-circuit "[ -e \"$ _clone_target\" ] && rm -rf
\"$ _clone_target\"" can fail under set -e and abort before the cp -RP fallback;
change this to an explicit guarded check such as: if [ -e "$_clone_target" ];
then rm -rf "$_clone_target"; fi (or append "|| true" to the rm command) so that
absence of the target does not cause the function to exit and the subsequent cp
-RP "$src" "$dest" still runs.

---

Nitpick comments:
In `@lib/copy.sh`:
- Around line 78-90: The module-level cache variable _fast_copy_os used by
_fast_copy_dir can become stale during tests that mock detect_os; add a small
cache-reset mechanism and documentation so tests remain deterministic: create a
tiny helper function (e.g., _fast_copy_reset_cache) that sets _fast_copy_os=""
and call that in test setup (or alternatively move the cache into a
function-scoped static or document the stale-cache risk in the module comment),
and reference _fast_copy_os, _fast_copy_dir and detect_os when updating
tests/docs so maintainers know how to clear the cache.

- Use if/fi instead of && for partial-clone cleanup to avoid exit
  code 1 leaking under set -e when target doesn't exist
- Reset _fast_copy_os in test setup for deterministic test isolation
@helizaga helizaga merged commit 5bc568f into coderabbitai:main Feb 17, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments